Discovering Outliers of Potential Drug Toxicities Using a Large-scale Data-driven Approach
Authors
Abstract
We systematically compared the adverse effects of cancer drugs across clinical trials, using a data-driven approach to detect adverse event outliers. Because many cancer drugs are toxic to patients, a better understanding of their adverse events is critical for developing therapies that minimize toxic effects. However, adverse events vary widely from one cancer drug to another, and methods for comparing them efficiently across drugs are lacking. To address this challenge, we present an exploratory study that integrates adverse event reports from multiple clinical trials in order to compare adverse events systematically across cancer drugs. To demonstrate our methods, we first collected data on 186,339 clinical trials from ClinicalTrials.gov and selected 30 common cancer drugs. We identified 1,602 cancer trials that studied the selected drugs and extracted 12,922 distinct adverse events from their reports. Using the extracted data, we ranked all 12,922 adverse events by their prevalence in the clinical trials, for example nausea (82%), fatigue (77%), and vomiting (75.97%). To detect drug outliers with a statistically high likelihood of causing a given event, we used boxplots to visualize adverse event outliers across drugs and applied Grubbs' test to evaluate their significance. The analyses showed that systematically integrating cross-trial data from multiple clinical trial reports makes it possible to detect adverse event outliers associated with cancer drugs. The method was demonstrated by detecting four statistically significant cases: axitinib with hypertension (Grubbs' test, P < 0.001), imatinib with muscle spasm (P < 0.001), vorinostat with deep vein thrombosis (P < 0.001), and afatinib with paronychia (P < 0.01).
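The two outlier steps named in the abstract, the boxplot rule and Grubbs' test, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the drug names and rates are hypothetical, the 1.5 × IQR whisker rule is an assumed boxplot convention, and the function names are invented for this example. Only the Grubbs statistic G is computed here, not a P value.

```python
import statistics


def tukey_outliers(rates, k=1.5):
    """Return drugs whose rate lies outside the boxplot whiskers (k * IQR)."""
    values = sorted(rates.values())
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return {drug: r for drug, r in rates.items() if r < low or r > high}


def grubbs_statistic(rates):
    """Return the most extreme drug and its Grubbs statistic G.

    G = max|x_i - mean| / s, where s is the sample standard deviation.
    """
    values = list(rates.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    drug = max(rates, key=lambda d: abs(rates[d] - mean))
    return drug, abs(rates[drug] - mean) / sd


# Hypothetical fraction of trials reporting one adverse event, per drug.
# These numbers are made up for illustration and are not from the study.
event_rate = {
    "drug_a": 0.05, "drug_b": 0.07, "drug_c": 0.06, "drug_d": 0.04,
    "drug_e": 0.08, "drug_f": 0.06, "drug_g": 0.40,
}

flagged = tukey_outliers(event_rate)
suspect, g = grubbs_statistic(event_rate)
print("boxplot outliers:", sorted(flagged))
print(f"Grubbs candidate: {suspect}, G = {g:.3f}")
```

To turn G into a significance decision, it is compared with a critical value that depends on the number of drugs n and the chosen level α. For the two-sided test this is ((n − 1)/√n) · √(t² / (n − 2 + t²)), where t is the upper α/(2n) quantile of the t distribution with n − 2 degrees of freedom. Grubbs' test also assumes the non-outlying values are approximately normally distributed, which should be checked before relying on the result.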
Journal title:
Volume 15, Issue
Pages -
Publication date 2016